This project is inspired by Moneyball: The Art of Winning an Unfair Game, a book by Michael Lewis, published in 2003, about the Oakland Athletics baseball team and its general manager Billy Beane. Its focus is the team’s analytical, evidence-based, sabermetric approach to assembling a competitive baseball team despite Oakland’s small budget (Wiki source).
In 2001 the Oakland Athletics’ team salary was only $34 million, compared with $112 million for the league-leading New York Yankees.
The discrepancy in team salary is not as drastic in the NBA.
Here I took a web scraping function from mbjoseph to scrape NBA team salaries.
- Web scraping hoopshype.com/salaries…
## [1] "Loaded 30 team salaries."
Here are the six largest team salaries:
## Team X2020.21
## 1 Golden State 170722835
## 2 Brooklyn 169052492
## 3 Philadelphia 147840679
## 4 LA Clippers 139722606
## 5 LA Lakers 138326578
## 6 Utah 136048278
Boxplot of team salary:
In 2001 the Oakland A’s paid about 1/2 the median MLB team salary. With this in mind we will attempt to build a competitive team with 1/2 the median NBA team salary. Here we see the median team salary in the NBA is roughly $130 million. That leaves us with 65 million dollars to work with.
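The budget arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this analysis: the helper name `half_median_budget` is hypothetical, and the $130M median is the rounded figure quoted above rather than a value recomputed from the scraped salaries.

```python
from statistics import median


def half_median_budget(team_salaries, fraction=0.5):
    """Return `fraction` of the median team salary (hypothetical helper)."""
    return fraction * median(team_salaries)


# Assumption: the league median is roughly $130M, as quoted in the text.
approx_median = 130_000_000
budget = half_median_budget([approx_median])
print(f"Budget: ${budget / 1e6:.0f}M")  # prints "Budget: $65M"
```

With the full list of 30 scraped team salaries in hand, the same helper would take that list directly instead of the rounded median.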
Hypothesis
Similar to baseball, we can create a competitive team with a salary of $65M.
What constitutes a competitive team?
Well, that’s not a simple question to answer. We need a team with players who contribute both offensively and defensively. First, we will analyze plus-minus statistics.
Plus-Minus
- Plus-Minus, a.k.a. +/-, simply keeps track of the net changes in the score when a given player is either on or off the court.
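As a rough sketch of the on-court half of that definition, the snippet below adds up the score changes that happen while a player is on the floor. The event format and the function name `plus_minus` are assumptions made for illustration, and the events themselves are invented, not real game data.

```python
def plus_minus(events, player):
    """Sum the score changes that happen while `player` is on the court.

    Each event is (points_for, points_against, players_on_court).
    """
    total = 0
    for points_for, points_against, on_court in events:
        if player in on_court:
            total += points_for - points_against
    return total


# Hypothetical stretch of play, invented for illustration only.
events = [
    (3, 0, {"A", "B"}),  # team scores three with A and B on the floor
    (0, 2, {"A"}),       # opponent scores two with A on, B off
    (2, 0, {"B"}),       # team scores two with B on, A off
]
print(plus_minus(events, "A"))  # 3 - 2 = 1
print(plus_minus(events, "B"))  # 3 + 2 = 5
```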
Real Plus-Minus
Real Plus-Minus can be broken down into offensive and defensive metrics:
Offensive Real Plus-Minus (ORPM): a player’s average impact on his team’s offensive performance, measured in points scored per 100 offensive possessions.
Defensive Real Plus-Minus (DRPM): a player’s average impact on his team’s defensive performance, measured in points allowed per 100 defensive possessions.
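A quick way to see how the two components relate to the overall number is to check whether RPM behaves like ORPM + DRPM. That relationship is an assumption here (the definitions above do not state a formula); the rows are copied from the ESPN sample table shown later in this post, and the helper `rpm_gap` is a hypothetical name.

```python
# (player, ORPM, DRPM, RPM) copied from the ESPN sample rows in this post.
rows = [
    ("Stephen Curry", 6.92, -0.10, 6.82),
    ("LeBron James", 4.66, 2.13, 6.79),
    ("Rudy Gobert", -1.36, 7.66, 6.30),
    ("Giannis Antetokounmpo", 3.91, 1.24, 5.14),
    ("Joel Embiid", 2.26, 2.83, 5.09),
    ("Paul George", 1.84, 3.14, 4.97),
]


def rpm_gap(orpm, drpm, rpm):
    """Absolute difference between ORPM + DRPM and the reported RPM."""
    return abs((orpm + drpm) - rpm)


for player, orpm, drpm, rpm in rows:
    print(f"{player}: ORPM + DRPM = {orpm + drpm:.2f}, reported RPM = {rpm:.2f}")
```

Any small gaps between the two columns are what you would expect if the published values are rounded to two decimals.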
RPM Wins
- RPM Wins provide an estimate of the number of wins each player has contributed to his team’s win total on the season. RPM Wins include the player’s Real Plus-Minus and his number of possessions played.
Before analyzing player statistics we need to know how much each player is worth. Let’s load individual player salaries.
- Web scraping hoopshype.com/salaries/players/…
## [1] "Loaded 577 player salaries."
Boxplot of NBA player salaries:
- Web scraping espn.com/nba/statistics/…
## [1] "Loaded 10 statistics for 527 players."
## RK Player TEAM GP MPG ORPM DRPM RPM WINS POS
## 1 1 Stephen Curry GS 58 34.1 6.92 -0.10 6.82 17.19 PG
## 2 2 LeBron James LAL 43 33.7 4.66 2.13 6.79 11.76 SF
## 3 3 Rudy Gobert UTAH 65 30.8 -1.36 7.66 6.30 14.82 C
## 4 4 Giannis Antetokounmpo MIL 56 33.1 3.91 1.24 5.14 13.03 PF
## 5 5 Joel Embiid PHI 47 31.5 2.26 2.83 5.09 9.57 C
## 6 6 Paul George LAC 50 33.6 1.84 3.14 4.97 10.46 SG
Add player salary as final column to player stats
## RK Player TEAM GP MPG ORPM DRPM RPM WINS POS X2020.21
## 1 1 Stephen Curry GS 58 34.1 6.92 -0.10 6.82 17.19 PG 43006362
## 2 2 LeBron James LAL 43 33.7 4.66 2.13 6.79 11.76 SF 39219566
## 3 3 Rudy Gobert UTAH 65 30.8 -1.36 7.66 6.30 14.82 C 26775281
## 4 4 Giannis Antetokounmpo MIL 56 33.1 3.91 1.24 5.14 13.03 PF 27528088
## 5 5 Joel Embiid PHI 47 31.5 2.26 2.83 5.09 9.57 C 29542010
## 6 6 Paul George LAC 50 33.6 1.84 3.14 4.97 10.46 SG 35450412
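The step of attaching salary as a final column amounts to a lookup by player name. Below is a minimal plain-Python sketch of that idea, not the code used for this post; `add_salary` is a hypothetical helper, the rows are a subset of the table above, and one salary is deliberately left out to show what happens to unmatched players.

```python
def add_salary(stats_rows, salary_by_player, column="X2020.21"):
    """Return copies of the stat rows with a salary column appended.

    Players missing from the salary table get None instead of being dropped.
    """
    merged = []
    for row in stats_rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row[column] = salary_by_player.get(row["Player"])
        merged.append(new_row)
    return merged


stats = [
    {"Player": "Stephen Curry", "TEAM": "GS", "RPM": 6.82},
    {"Player": "LeBron James", "TEAM": "LAL", "RPM": 6.79},
    {"Player": "Rudy Gobert", "TEAM": "UTAH", "RPM": 6.30},
]
# Gobert is omitted on purpose to illustrate an unmatched name.
salaries = {"Stephen Curry": 43006362, "LeBron James": 39219566}

merged = add_salary(stats, salaries)
for row in merged:
    print(row)
```

Matching on names is fragile when two sites spell a player differently, so it is worth counting the `None` values after a join like this.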
Analyzing RPM vs. Salary
What’s going on?
The chart on the left is a plain scatterplot mess. On the right we can understand the data a little bit more.
- In general, players who play more minutes are paid more and have a higher plus/minus. This might make things difficult for us because we need a competitive team that is cheap!
Time to collect more data
Our Plus/Minus stats come from NBA.com.
Also, we will web scrape the data thanks to some help from our guy Ashwin.
- Web scraping stats.nba.com/stats/…
## [1] "Loaded 65 statistics for 534 players."
Now that we have all the individual player statistics we could ever need to analyze a player’s value we will shift over to compile NBA team statistics.
- Web scraping nba.com/standings…
## [1] "Loaded 81 stats for 30 teams."
## [1] "Cleaned and reduced to 8 stats (including salary) for 30 teams."
Why the team stats?
These stats will enable us to test how competitively our team stacks up against other teams in the NBA.
## Team TeamName W L PPG OppPPG DiffPointsPG X2020.21
## 1 Utah Jazz 48 18 116.5 107.0 9.4 136048278
## 2 Philadelphia 76ers 45 21 113.9 108.3 5.6 147840679
## 3 Phoenix Suns 47 19 114.6 108.9 5.7 128858241
## 4 Brooklyn Nets 43 23 118.7 114.5 4.2 169052492
## 5 Milwaukee Bucks 42 24 119.6 113.4 6.2 135449418
## 6 Denver Nuggets 44 22 115.1 109.7 5.4 129693210
Team Salary vs. Team Wins
Let’s see if spending more money translates to winning more games.
Looking at the data another way we observe a pattern that might be obvious.
Team Wins vs. Team Diff PPG
Wins and DiffPointsPG
- These stats are generally correlated
- Higher DiffPointsPG translates to more Wins
- However, higher salary doesn’t always correlate with more wins
- There are lower salary teams with better PPG and Wins!
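One way to put numbers on these observations is a Pearson correlation. The sketch below uses only the six teams printed in the table above, so it is an illustration of the calculation, not a league-wide result; the `pearson` helper is written out by hand and is not taken from the original analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (W, DiffPointsPG, X2020.21) for the six teams shown in the table above.
teams = {
    "Utah": (48, 9.4, 136048278),
    "Philadelphia": (45, 5.6, 147840679),
    "Phoenix": (47, 5.7, 128858241),
    "Brooklyn": (43, 4.2, 169052492),
    "Milwaukee": (42, 6.2, 135449418),
    "Denver": (44, 5.4, 129693210),
}
wins = [t[0] for t in teams.values()]
diff = [t[1] for t in teams.values()]
salary = [t[2] for t in teams.values()]

r_diff = pearson(wins, diff)
r_salary = pearson(wins, salary)
print(f"Wins vs DiffPointsPG: r = {r_diff:.2f}")
print(f"Wins vs salary:       r = {r_salary:.2f}")
```

Six teams from the top of the standings are a small, biased sample, so the coefficients only hint at the pattern; the same function applied to all 30 teams would be the fairer test.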
This finding gives us hope. It opens the door a little bit more for being able to build a winning team on a low budget.
Now what?
Well, we have evidence that it is possible to be a competitive team with a lower end budget. Let’s see how far we can stretch the limits now that we have all the player and team statistics we need for analysis.
First, let’s filter out players…
- With few minutes (we need players!)
- And a Plus/Minus < 200 (we need good players)
- With salaries over $15M (we need affordable players)
## [1] "Now we have 16 players to analyze."
Here is the composition of the team:
##
## C PF PG SF SG
## 1 2 2 2 5
Here is the team:
## Player POS MPG P_M_PG X2020.21
## 1 Georges Niang SF 15.4 4.636364 1783557
## 2 Reggie Jackson PG 23.1 3.500000 2331593
## 3 Donte DiVincenzo SG 27.3 4.967213 3044160
## 4 Mikal Bridges SF 32.8 4.287879 4359000
## 5 Pat Connaughton SG 22.9 3.587302 4938273
## 6 Donovan Mitchell SG 33.4 5.415094 5195501
## 7 Trae Young PG 34.0 3.724138 6571800
## 8 Seth Curry SG 28.7 5.792453 7834449
## 9 Royce O'Neale PF 31.7 6.630769 8500000
## 10 Dario Saric PF 17.2 4.511111 9250000
## 11 Deandre Ayton C 30.6 4.257576 10018200
## 12 Joe Ingles SG 27.6 6.655738 10363637
## [1] "The salary is 74.2 M"
From the analysis of plus/minus we ended up with a team that is guard heavy and, at $74.2M, about $9.2M over our $65M budget. Not bad for a first attempt. Let’s keep going with real plus-minus.
Attempt 2: let’s filter out players…
- With few minutes (we need players!)
- And a Real Plus Minus < 1.8 (we need good players)
- With salaries over $15M (we need affordable players)
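The three filters can be sketched as a single function, followed by keeping the cheapest players. This is an illustration under stated assumptions, not the code used here: the minimum-minutes cutoff is never given in the text, so `min_mpg=15.0` is a placeholder, the function names are hypothetical, and the last sample row is an invented player added only to show the metric filter at work.

```python
def shortlist(players, metric, min_metric, min_mpg, max_salary=15_000_000):
    """Keep players who pass all three filters, cheapest first."""
    keep = [
        p for p in players
        if p["MPG"] >= min_mpg
        and p[metric] >= min_metric
        and p["salary"] <= max_salary
    ]
    return sorted(keep, key=lambda p: p["salary"])


def pick_team(candidates, size=12):
    """Take the `size` least expensive candidates from a sorted shortlist."""
    return candidates[:size]


players = [
    {"Player": "Stephen Curry", "MPG": 34.1, "RPM": 6.82, "salary": 43006362},
    {"Player": "Duncan Robinson", "MPG": 31.8, "RPM": 2.88, "salary": 1663861},
    {"Player": "Bam Adebayo", "MPG": 33.5, "RPM": 3.12, "salary": 5115492},
    {"Player": "John Collins", "MPG": 29.6, "RPM": 1.83, "salary": 4137302},
    # Invented row, not a real player: fails the minutes and RPM filters.
    {"Player": "Hypothetical Reserve", "MPG": 8.0, "RPM": 0.5, "salary": 1500000},
]

# min_mpg is an assumed cutoff; the text does not state the one actually used.
team = pick_team(shortlist(players, "RPM", min_metric=1.8, min_mpg=15.0))
for p in team:
    print(p["Player"], p["salary"])
```

The same two functions cover the other attempts by swapping the metric column and its threshold.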
## [1] "Now we have 14 players to analyze."
Here is the composition of the team:
##
## C PF PG SF SG
## 1 2 3 3 3
Here is the team:
## Player POS MPG RPM X2020.21
## 1 Duncan Robinson SG 31.8 2.88 1663861
## 2 Donte DiVincenzo SG 27.3 2.32 3044160
## 3 John Collins PF 29.6 1.83 4137302
## 4 Mikal Bridges SF 32.8 2.10 4359000
## 5 Bam Adebayo C 33.5 3.12 5115492
## 6 Donovan Mitchell SG 33.4 2.53 5195501
## 7 Trae Young PG 34.0 2.10 6571800
## 8 Luka Doncic PG 35.1 2.83 8049360
## 9 De'Aaron Fox PG 35.1 2.78 8099627
## 10 Kyle Anderson SF 27.3 2.39 9505100
## 11 Jayson Tatum SF 35.8 2.43 9897120
## 12 Zion Williamson PF 33.2 3.23 10245480
## [1] "The salary is 75.9 M"
From the analysis of real plus/minus we again ended up with a team that is over budget ($75.9M against our $65M), but one that looks more like a normal lineup. Not bad for a second attempt. Let’s keep going with real plus/minus wins.
Attempt 3: let’s filter out players…
- With few minutes (we need players!)
- And an RPM WINS < 5 (we need good players)
- With salaries over $15M (we need affordable players)
## [1] "Now we have 26 players to analyze."
Here is the composition of the team:
##
## C PF PG SF SG
## 1 1 3 2 5
Here is the team:
## Player POS MPG WINS X2020.21
## 1 Duncan Robinson SG 31.8 9.37 1663861
## 2 Kevin Huerter SG 31.0 5.18 2761920
## 3 Donte DiVincenzo SG 27.3 6.90 3044160
## 4 John Collins PF 29.6 6.00 4137302
## 5 Reggie Bullock SF 28.9 5.24 4200000
## 6 Mikal Bridges SF 32.8 8.05 4359000
## 7 Bam Adebayo C 33.5 8.95 5115492
## 8 Donovan Mitchell SG 33.4 7.60 5195501
## 9 Trae Young PG 34.0 7.44 6571800
## 10 Luka Doncic PG 35.1 9.50 8049360
## 11 De'Aaron Fox PG 35.1 9.12 8099627
## 12 RJ Barrett SG 34.7 5.17 8231760
## [1] "The salary is 61.4 M"
Ayyoo!
Now we are in business. Out of the 26 players that passed the filters, we chose to stick with the 12 that were the least expensive. As of May 6th, we beat our goal by $3.6 million. This is subject to change because the web scraping pulls live data and the season is still underway.
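The salary total and the margin under budget can be re-derived from the 12 salaries listed in the table above. The snippet is a plain arithmetic check, not the original code; the variable names are chosen for illustration.

```python
BUDGET = 65_000_000  # half of the roughly $130M median team salary

# Salaries copied from the 12-player table above.
team_salaries = {
    "Duncan Robinson": 1663861,
    "Kevin Huerter": 2761920,
    "Donte DiVincenzo": 3044160,
    "John Collins": 4137302,
    "Reggie Bullock": 4200000,
    "Mikal Bridges": 4359000,
    "Bam Adebayo": 5115492,
    "Donovan Mitchell": 5195501,
    "Trae Young": 6571800,
    "Luka Doncic": 8049360,
    "De'Aaron Fox": 8099627,
    "RJ Barrett": 8231760,
}

total = sum(team_salaries.values())
under_budget = BUDGET - total
print(f"The salary is {total / 1e6:.1f} M")      # prints "The salary is 61.4 M"
print(f"Under budget by {under_budget / 1e6:.1f} M")  # prints "Under budget by 3.6 M"
```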
Now, the real question:
Can this newly created $Bball team compete with current teams?
- Let’s test by pro-rating RPM WINS vs. existing league leaders’ RPM WINS
## [1] "The top team is: Utah with 48 wins and 18 losses."
Our team will play a similar style to other NBA teams:
- players 1-3 will play 34 minutes per game (mpg),
- players 4-8 will play 24 mpg,
- players 9 & 10 will play 9 mpg,
- and players 11 & 12 will be reserves.
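As a sanity check on this rotation, the minutes should add up to the 240 player-minutes in a regulation game (five players on the floor for 48 minutes). The sketch below verifies that and shows one possible way to pro-rate a player's RPM WINS to his new minutes. The linear scaling in `prorate_wins` is an assumption made for illustration; the exact pro-rating used for the projection below is not spelled out here, and this snippet does not attempt to reproduce it.

```python
COURT_MINUTES = 5 * 48  # five players on the floor for a 48-minute game

# Minutes per game by roster slot, as listed above (reserves get 0).
rotation = [34] * 3 + [24] * 5 + [9] * 2 + [0] * 2
total_minutes = sum(rotation)
print(total_minutes == COURT_MINUTES)  # prints True


def prorate_wins(wins, actual_mpg, assigned_mpg):
    """Scale RPM WINS linearly with minutes played (assumed relationship)."""
    return wins * assigned_mpg / actual_mpg


# Example with Luka Doncic's row: 9.50 WINS at 35.1 mpg, re-assigned 34 mpg.
scaled = prorate_wins(9.50, actual_mpg=35.1, assigned_mpg=34)
print(f"{scaled:.2f}")
```

Linear scaling ignores fatigue and matchup effects, so it is best read as a first approximation.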
## [1] "Our $Bball team, with a salary of 61.4M, has a projected W-L record of 61-5!!!"
Success!
We are on track to head into the playoffs as the number 1 overall team! At this rate we may even post the best record of all time.